Overview

Dataset statistics

Number of variables: 26
Number of observations: 205
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.8 KiB
Average record size in memory: 208.6 B

Variable types

Numeric: 10
Categorical: 16

Alerts

normalized-losses has a high cardinality: 52 distinct values (High cardinality)
horsepower has a high cardinality: 60 distinct values (High cardinality)
price has a high cardinality: 187 distinct values (High cardinality)
symboling is highly correlated with wheel-base and 1 other field (High correlation)
wheel-base is highly correlated with symboling and 6 other fields (High correlation)
length is highly correlated with wheel-base and 6 other fields (High correlation)
width is highly correlated with wheel-base and 5 other fields (High correlation)
height is highly correlated with symboling and 2 other fields (High correlation)
curb-weight is highly correlated with wheel-base and 5 other fields (High correlation)
engine-size is highly correlated with wheel-base and 5 other fields (High correlation)
city-mpg is highly correlated with length and 4 other fields (High correlation)
highway-mpg is highly correlated with wheel-base and 5 other fields (High correlation)
symboling is highly correlated with wheel-base and 1 other field (High correlation)
wheel-base is highly correlated with symboling and 6 other fields (High correlation)
length is highly correlated with wheel-base and 5 other fields (High correlation)
width is highly correlated with wheel-base and 5 other fields (High correlation)
height is highly correlated with symboling and 1 other field (High correlation)
curb-weight is highly correlated with wheel-base and 5 other fields (High correlation)
engine-size is highly correlated with wheel-base and 5 other fields (High correlation)
city-mpg is highly correlated with length and 4 other fields (High correlation)
highway-mpg is highly correlated with wheel-base and 5 other fields (High correlation)
wheel-base is highly correlated with length and 3 other fields (High correlation)
length is highly correlated with wheel-base and 5 other fields (High correlation)
width is highly correlated with wheel-base and 5 other fields (High correlation)
curb-weight is highly correlated with wheel-base and 5 other fields (High correlation)
engine-size is highly correlated with wheel-base and 5 other fields (High correlation)
city-mpg is highly correlated with length and 4 other fields (High correlation)
highway-mpg is highly correlated with length and 4 other fields (High correlation)
make is highly correlated with num-of-cylinders and 9 other fields (High correlation)
aspiration is highly correlated with fuel-system and 4 other fields (High correlation)
num-of-doors is highly correlated with body-style (High correlation)
body-style is highly correlated with num-of-doors (High correlation)
num-of-cylinders is highly correlated with make and 6 other fields (High correlation)
fuel-system is highly correlated with make and 5 other fields (High correlation)
stroke is highly correlated with make and 10 other fields (High correlation)
horsepower is highly correlated with make and 10 other fields (High correlation)
engine-type is highly correlated with make and 5 other fields (High correlation)
drive-wheels is highly correlated with make and 2 other fields (High correlation)
bore is highly correlated with make and 9 other fields (High correlation)
engine-location is highly correlated with make and 4 other fields (High correlation)
normalized-losses is highly correlated with make and 1 other field (High correlation)
peak-rpm is highly correlated with make and 8 other fields (High correlation)
fuel-type is highly correlated with fuel-system and 4 other fields (High correlation)
symboling is highly correlated with normalized-losses and 12 other fields (High correlation)
normalized-losses is highly correlated with symboling and 19 other fields (High correlation)
make is highly correlated with symboling and 21 other fields (High correlation)
fuel-type is highly correlated with aspiration and 6 other fields (High correlation)
aspiration is highly correlated with make and 6 other fields (High correlation)
num-of-doors is highly correlated with symboling and 5 other fields (High correlation)
body-style is highly correlated with normalized-losses and 8 other fields (High correlation)
drive-wheels is highly correlated with symboling and 16 other fields (High correlation)
engine-location is highly correlated with make and 6 other fields (High correlation)
wheel-base is highly correlated with symboling and 19 other fields (High correlation)
length is highly correlated with symboling and 19 other fields (High correlation)
width is highly correlated with symboling and 18 other fields (High correlation)
height is highly correlated with symboling and 19 other fields (High correlation)
curb-weight is highly correlated with symboling and 18 other fields (High correlation)
engine-type is highly correlated with normalized-losses and 15 other fields (High correlation)
num-of-cylinders is highly correlated with normalized-losses and 16 other fields (High correlation)
engine-size is highly correlated with normalized-losses and 18 other fields (High correlation)
fuel-system is highly correlated with normalized-losses and 18 other fields (High correlation)
bore is highly correlated with symboling and 22 other fields (High correlation)
stroke is highly correlated with symboling and 22 other fields (High correlation)
compression-ratio is highly correlated with make and 15 other fields (High correlation)
horsepower is highly correlated with symboling and 23 other fields (High correlation)
peak-rpm is highly correlated with symboling and 22 other fields (High correlation)
city-mpg is highly correlated with normalized-losses and 15 other fields (High correlation)
highway-mpg is highly correlated with normalized-losses and 17 other fields (High correlation)
price is uniformly distributed (Uniform)
symboling has 67 (32.7%) zeros (Zeros)

Reproduction

Analysis started: 2022-06-27 07:23:16.555722
Analysis finished: 2022-06-27 07:23:33.845557
Duration: 17.29 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

symboling
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8341463415
Minimum: -2
Maximum: 3
Zeros: 67
Zeros (%): 32.7%
Negative: 25
Negative (%): 12.2%
Memory size: 1.7 KiB

Quantile statistics

Minimum: -2
5-th percentile: -1
Q1: 0
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 3
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.245306828
Coefficient of variation (CV): 1.492911695
Kurtosis: -0.6762713562
Mean: 0.8341463415
Median Absolute Deviation (MAD): 1
Skewness: 0.2110722721
Sum: 171
Variance: 1.550789096
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      67     32.7%
1      54     26.3%
2      32     15.6%
3      27     13.2%
-1     22     10.7%
-2     3      1.5%

Value  Count  Frequency (%)
-2     3      1.5%
-1     22     10.7%
0      67     32.7%
1      54     26.3%
2      32     15.6%
3      27     13.2%

Value  Count  Frequency (%)
3      27     13.2%
2      32     15.6%
1      54     26.3%
0      67     32.7%
-1     22     10.7%
-2     3      1.5%
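
Because symboling takes only six distinct values, its summary statistics can be cross-checked from the frequency table alone. The sketch below is one way to do that with the Python standard library; it is not how the profiling tool itself computes them. It assumes the reported variance uses the sample (n - 1) denominator, which is what the numbers above are consistent with.

```python
from statistics import mean, median, variance

# Value -> count pairs copied from the symboling frequency table.
counts = {0: 67, 1: 54, 2: 32, 3: 27, -1: 22, -2: 3}

# Expand the table back into one value per observation.
values = sorted(v for v, n in counts.items() for _ in range(n))

n_obs = len(values)                                   # Number of observations
total = sum(values)                                   # Sum
col_mean = mean(values)                               # Mean
col_median = median(values)                           # Median
col_variance = variance(values)                       # Sample variance (n - 1)
zeros_pct = 100 * counts[0] / n_obs                   # Zeros (%)
negative = sum(n for v, n in counts.items() if v < 0) # Negative

print(n_obs, total, col_median, negative)
print(round(col_mean, 10), round(col_variance, 9), round(zeros_pct, 1))
```

The same expansion trick works for any low-cardinality numeric column in the report, but not for columns where the table ends in an "Other values" row.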

normalized-losses
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct: 52
Distinct (%): 25.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value              Count
?                  41
161                11
91                 8
150                7
134                6
Other values (47)  132

Length

Max length: 3
Median length: 3
Mean length: 2.356097561
Min length: 1

Characters and Unicode

Total characters: 483
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
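
As a rough illustration of what a per-category character count involves, the sketch below uses Python's standard `unicodedata` module. It is not the report generator's own code. The function name `category_counts` is hypothetical, and the input is the five sample rows shown for this variable. The two-letter codes are the Unicode general-category abbreviations: `Nd` corresponds to "Decimal Number" and `Po` to "Other Punctuation".

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category across string values."""
    counter = Counter()
    for value in values:
        for ch in value:
            counter[unicodedata.category(ch)] += 1
    return counter

# First five normalized-losses values, as listed in the Sample block.
sample = ["?", "?", "?", "164", "164"]
cats = category_counts(sample)
print(cats)
```

The standard library does not expose script or block properties, so those two breakdowns would need a different data source.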

Unique

Unique: 10
Unique (%): 4.9%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: 164
5th row: 164
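
The report types this variable as Categorical and counts 0 missing values, yet 41 rows hold the literal string "?". If "?" is a placeholder for an unknown value (an assumption; the report itself does not say so), the column could be parsed as numeric with those rows treated as missing. A minimal sketch, using a hypothetical helper `parse_optional_int` and the five sample rows above:

```python
MISSING_MARKER = "?"  # assumption: "?" stands for an unknown value

def parse_optional_int(text):
    """Return an int, or None when the cell holds the missing-value marker."""
    text = text.strip()
    return None if text == MISSING_MARKER else int(text)

# First five normalized-losses values, as listed in the Sample block.
sample = ["?", "?", "?", "164", "164"]
parsed = [parse_optional_int(s) for s in sample]
missing = sum(p is None for p in parsed)
present = [p for p in parsed if p is not None]

print(parsed, missing, present)
```

If the assumption holds, the same treatment would presumably apply to the other columns flagged for high cardinality, such as horsepower and price.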

Common Values

Value              Count  Frequency (%)
?                  41     20.0%
161                11     5.4%
91                 8      3.9%
150                7      3.4%
134                6      2.9%
128                6      2.9%
104                6      2.9%
85                 5      2.4%
94                 5      2.4%
65                 5      2.4%
Other values (42)  105    51.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
41
 
20.0%
16111
 
5.4%
918
 
3.9%
1507
 
3.4%
1346
 
2.9%
1286
 
2.9%
1046
 
2.9%
745
 
2.4%
955
 
2.4%
1035
 
2.4%
Other values (42)105
51.2%

Most occurring characters

ValueCountFrequency (%)
1151
31.3%
844
 
9.1%
?41
 
8.5%
538
 
7.9%
936
 
7.5%
036
 
7.5%
436
 
7.5%
230
 
6.2%
629
 
6.0%
326
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number442
91.5%
Other Punctuation41
 
8.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1151
34.2%
844
 
10.0%
538
 
8.6%
936
 
8.1%
036
 
8.1%
436
 
8.1%
230
 
6.8%
629
 
6.6%
326
 
5.9%
716
 
3.6%
Other Punctuation
ValueCountFrequency (%)
?41
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common483
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1151
31.3%
844
 
9.1%
?41
 
8.5%
538
 
7.9%
936
 
7.5%
036
 
7.5%
436
 
7.5%
230
 
6.2%
629
 
6.0%
326
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII483
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1151
31.3%
844
 
9.1%
?41
 
8.5%
538
 
7.9%
936
 
7.5%
036
 
7.5%
436
 
7.5%
230
 
6.2%
629
 
6.0%
326
 
5.4%

make
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct22
Distinct (%)10.7%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
toyota
32 
nissan
18 
mazda
17 
mitsubishi
13 
honda
13 
Other values (17)
112 

Length

Max length13
Median length11
Mean length6.47804878
Min length3

Characters and Unicode

Total characters1328
Distinct characters25
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)0.5%

Sample

1st rowalfa-romero
2nd rowalfa-romero
3rd rowalfa-romero
4th rowaudi
5th rowaudi

Common Values

ValueCountFrequency (%)
toyota32
15.6%
nissan18
 
8.8%
mazda17
 
8.3%
mitsubishi13
 
6.3%
honda13
 
6.3%
volkswagen12
 
5.9%
subaru12
 
5.9%
peugot11
 
5.4%
volvo11
 
5.4%
dodge9
 
4.4%
Other values (12)57
27.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
toyota32
15.6%
nissan18
 
8.8%
mazda17
 
8.3%
mitsubishi13
 
6.3%
honda13
 
6.3%
volkswagen12
 
5.9%
subaru12
 
5.9%
peugot11
 
5.4%
volvo11
 
5.4%
dodge9
 
4.4%
Other values (12)57
27.8%

Most occurring characters

ValueCountFrequency (%)
a154
 
11.6%
o152
 
11.4%
s109
 
8.2%
t100
 
7.5%
e81
 
6.1%
u76
 
5.7%
n71
 
5.3%
i68
 
5.1%
d63
 
4.7%
m57
 
4.3%
Other values (15)397
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1317
99.2%
Dash Punctuation11
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a154
 
11.7%
o152
 
11.5%
s109
 
8.3%
t100
 
7.6%
e81
 
6.2%
u76
 
5.8%
n71
 
5.4%
i68
 
5.2%
d63
 
4.8%
m57
 
4.3%
Other values (14)386
29.3%
Dash Punctuation
ValueCountFrequency (%)
-11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1317
99.2%
Common11
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a154
 
11.7%
o152
 
11.5%
s109
 
8.3%
t100
 
7.6%
e81
 
6.2%
u76
 
5.8%
n71
 
5.4%
i68
 
5.2%
d63
 
4.8%
m57
 
4.3%
Other values (14)386
29.3%
Common
ValueCountFrequency (%)
-11
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a154
 
11.6%
o152
 
11.4%
s109
 
8.2%
t100
 
7.5%
e81
 
6.1%
u76
 
5.7%
n71
 
5.3%
i68
 
5.1%
d63
 
4.7%
m57
 
4.3%
Other values (15)397
29.9%

fuel-type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
gas
185 
diesel
20 

Length

Max length6
Median length3
Mean length3.292682927
Min length3

Characters and Unicode

Total characters675
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowgas
2nd rowgas
3rd rowgas
4th rowgas
5th rowgas

Common Values

ValueCountFrequency (%)
gas185
90.2%
diesel20
 
9.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
gas185
90.2%
diesel20
 
9.8%

Most occurring characters

Value  Count  Frequency (%)
s      205    30.4%
g      185    27.4%
a      185    27.4%
e      40     5.9%
d      20     3.0%
i      20     3.0%
l      20     3.0%
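
The character counts for fuel-type follow directly from the two category counts (gas: 185, diesel: 20). The sketch below, which is only an illustration and not the tool's implementation, weights each character of a category by how many rows hold that category, then derives the total character count and mean length.

```python
from collections import Counter

# Category -> row count, copied from the fuel-type Common Values table.
value_counts = {"gas": 185, "diesel": 20}

# Each character occurs once per row for every time it appears in the value.
char_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n

total_chars = sum(char_counts.values())   # Total characters
n_rows = sum(value_counts.values())       # Number of observations
mean_length = total_chars / n_rows        # Mean length

print(char_counts.most_common())
print(total_chars, round(mean_length, 9))
```

Note that "e" is counted twice per diesel row, which is why it appears 40 times for 20 rows.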

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter675
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s205
30.4%
g185
27.4%
a185
27.4%
e40
 
5.9%
d20
 
3.0%
i20
 
3.0%
l20
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin675
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s205
30.4%
g185
27.4%
a185
27.4%
e40
 
5.9%
d20
 
3.0%
i20
 
3.0%
l20
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII675
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s205
30.4%
g185
27.4%
a185
27.4%
e40
 
5.9%
d20
 
3.0%
i20
 
3.0%
l20
 
3.0%

aspiration
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
std
168 
turbo
37 

Length

Max length5
Median length3
Mean length3.36097561
Min length3

Characters and Unicode

Total characters689
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowstd
2nd rowstd
3rd rowstd
4th rowstd
5th rowstd

Common Values

ValueCountFrequency (%)
std168
82.0%
turbo37
 
18.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
std168
82.0%
turbo37
 
18.0%

Most occurring characters

ValueCountFrequency (%)
t205
29.8%
s168
24.4%
d168
24.4%
u37
 
5.4%
r37
 
5.4%
b37
 
5.4%
o37
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter689
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t205
29.8%
s168
24.4%
d168
24.4%
u37
 
5.4%
r37
 
5.4%
b37
 
5.4%
o37
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Latin689
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t205
29.8%
s168
24.4%
d168
24.4%
u37
 
5.4%
r37
 
5.4%
b37
 
5.4%
o37
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII689
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t205
29.8%
s168
24.4%
d168
24.4%
u37
 
5.4%
r37
 
5.4%
b37
 
5.4%
o37
 
5.4%

num-of-doors
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
four
114 
two
89 
?
 
2

Length

Max length4
Median length4
Mean length3.536585366
Min length1

Characters and Unicode

Total characters725
Distinct characters7
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowtwo
2nd rowtwo
3rd rowtwo
4th rowfour
5th rowfour

Common Values

ValueCountFrequency (%)
four114
55.6%
two89
43.4%
?2
 
1.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
four114
55.6%
two89
43.4%
2
 
1.0%

Most occurring characters

ValueCountFrequency (%)
o203
28.0%
f114
15.7%
u114
15.7%
r114
15.7%
t89
12.3%
w89
12.3%
?2
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter723
99.7%
Other Punctuation2
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o203
28.1%
f114
15.8%
u114
15.8%
r114
15.8%
t89
12.3%
w89
12.3%
Other Punctuation
ValueCountFrequency (%)
?2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin723
99.7%
Common2
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o203
28.1%
f114
15.8%
u114
15.8%
r114
15.8%
t89
12.3%
w89
12.3%
Common
ValueCountFrequency (%)
?2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII725
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o203
28.0%
f114
15.7%
u114
15.7%
r114
15.7%
t89
12.3%
w89
12.3%
?2
 
0.3%

body-style
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)2.4%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
sedan
96 
hatchback
70 
wagon
25 
hardtop
 
8
convertible
 
6

Length

Max length11
Median length5
Mean length6.619512195
Min length5

Characters and Unicode

Total characters1357
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowconvertible
2nd rowconvertible
3rd rowhatchback
4th rowsedan
5th rowsedan

Common Values

ValueCountFrequency (%)
sedan96
46.8%
hatchback70
34.1%
wagon25
 
12.2%
hardtop8
 
3.9%
convertible6
 
2.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
sedan96
46.8%
hatchback70
34.1%
wagon25
 
12.2%
hardtop8
 
3.9%
convertible6
 
2.9%

Most occurring characters

ValueCountFrequency (%)
a269
19.8%
h148
10.9%
c146
10.8%
n127
9.4%
e108
8.0%
d104
 
7.7%
s96
 
7.1%
t84
 
6.2%
b76
 
5.6%
k70
 
5.2%
Other values (8)129
9.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1357
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a269
19.8%
h148
10.9%
c146
10.8%
n127
9.4%
e108
8.0%
d104
 
7.7%
s96
 
7.1%
t84
 
6.2%
b76
 
5.6%
k70
 
5.2%
Other values (8)129
9.5%

Most occurring scripts

ValueCountFrequency (%)
Latin1357
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a269
19.8%
h148
10.9%
c146
10.8%
n127
9.4%
e108
8.0%
d104
 
7.7%
s96
 
7.1%
t84
 
6.2%
b76
 
5.6%
k70
 
5.2%
Other values (8)129
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII1357
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a269
19.8%
h148
10.9%
c146
10.8%
n127
9.4%
e108
8.0%
d104
 
7.7%
s96
 
7.1%
t84
 
6.2%
b76
 
5.6%
k70
 
5.2%
Other values (8)129
9.5%

drive-wheels
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
fwd
120 
rwd
76 
4wd
 
9

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters615
Distinct characters5
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowrwd
2nd rowrwd
3rd rowrwd
4th rowfwd
5th row4wd

Common Values

ValueCountFrequency (%)
fwd120
58.5%
rwd76
37.1%
4wd9
 
4.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
fwd120
58.5%
rwd76
37.1%
4wd9
 
4.4%

Most occurring characters

ValueCountFrequency (%)
w205
33.3%
d205
33.3%
f120
19.5%
r76
 
12.4%
49
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter606
98.5%
Decimal Number9
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w205
33.8%
d205
33.8%
f120
19.8%
r76
 
12.5%
Decimal Number
ValueCountFrequency (%)
49
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin606
98.5%
Common9
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
w205
33.8%
d205
33.8%
f120
19.8%
r76
 
12.5%
Common
ValueCountFrequency (%)
49
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII615
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
w205
33.3%
d205
33.3%
f120
19.5%
r76
 
12.4%
49
 
1.5%

engine-location
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
front
202 
rear
 
3

Length

Max length5
Median length5
Mean length4.985365854
Min length4

Characters and Unicode

Total characters1022
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfront
2nd rowfront
3rd rowfront
4th rowfront
5th rowfront

Common Values

ValueCountFrequency (%)
front202
98.5%
rear3
 
1.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
front202
98.5%
rear3
 
1.5%

Most occurring characters

ValueCountFrequency (%)
r208
20.4%
f202
19.8%
o202
19.8%
n202
19.8%
t202
19.8%
e3
 
0.3%
a3
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1022
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r208
20.4%
f202
19.8%
o202
19.8%
n202
19.8%
t202
19.8%
e3
 
0.3%
a3
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin1022
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r208
20.4%
f202
19.8%
o202
19.8%
n202
19.8%
t202
19.8%
e3
 
0.3%
a3
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1022
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r208
20.4%
f202
19.8%
o202
19.8%
n202
19.8%
t202
19.8%
e3
 
0.3%
a3
 
0.3%

wheel-base
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 53
Distinct (%): 25.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.75658537
Minimum: 86.6
Maximum: 120.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 86.6
5-th percentile: 93.02
Q1: 94.5
Median: 97
Q3: 102.4
95-th percentile: 110
Maximum: 120.9
Range: 34.3
Interquartile range (IQR): 7.9

Descriptive statistics

Standard deviation: 6.021775685
Coefficient of variation (CV): 0.06097594062
Kurtosis: 1.017038946
Mean: 98.75658537
Median Absolute Deviation (MAD): 2.7
Skewness: 1.050213776
Sum: 20245.1
Variance: 36.2617824
Monotonicity: Not monotonic
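
Several of the wheel-base figures above are derived from the others, which gives a quick consistency check on the report. The sketch below recomputes them from the reported values using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / n). The variable names are illustrative only, and small floating-point differences from the printed values are expected.

```python
import math

# Values copied from the wheel-base statistics; n is the number of observations.
minimum, maximum = 86.6, 120.9
q1, q3 = 94.5, 102.4
mean, std = 98.75658537, 6.021775685
total, n = 20245.1, 205

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance
mean_from_sum = total / n         # Mean recomputed from Sum

print(round(value_range, 1), round(iqr, 1))
print(round(cv, 11), round(variance, 7), round(mean_from_sum, 8))
```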
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
94.521
 
10.2%
93.720
 
9.8%
95.713
 
6.3%
96.58
 
3.9%
97.37
 
3.4%
98.47
 
3.4%
104.36
 
2.9%
100.46
 
2.9%
107.96
 
2.9%
98.86
 
2.9%
Other values (43)105
51.2%
ValueCountFrequency (%)
86.62
 
1.0%
88.41
 
0.5%
88.62
 
1.0%
89.53
 
1.5%
91.32
 
1.0%
931
 
0.5%
93.15
 
2.4%
93.31
 
0.5%
93.720
9.8%
94.31
 
0.5%
ValueCountFrequency (%)
120.91
 
0.5%
115.62
 
1.0%
114.24
2.0%
1132
 
1.0%
1121
 
0.5%
1103
1.5%
109.15
2.4%
1081
 
0.5%
107.96
2.9%
106.71
 
0.5%

length
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct75
Distinct (%)36.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean174.0492683
Minimum141.1
Maximum208.1
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum141.1
5-th percentile157.14
Q1166.3
median173.2
Q3183.1
95-th percentile196.36
Maximum208.1
Range67
Interquartile range (IQR)16.8

Descriptive statistics

Standard deviation12.33728853
Coefficient of variation (CV)0.0708838862
Kurtosis-0.08289485345
Mean174.0492683
Median Absolute Deviation (MAD)6.9
Skewness0.1559537713
Sum35680.1
Variance152.2086882
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
157.315
 
7.3%
188.811
 
5.4%
171.77
 
3.4%
186.77
 
3.4%
166.37
 
3.4%
165.36
 
2.9%
177.86
 
2.9%
176.26
 
2.9%
186.66
 
2.9%
1725
 
2.4%
Other values (65)129
62.9%
ValueCountFrequency (%)
141.11
 
0.5%
144.62
 
1.0%
1503
 
1.5%
155.93
 
1.5%
156.91
 
0.5%
157.11
 
0.5%
157.315
7.3%
157.91
 
0.5%
158.73
 
1.5%
158.81
 
0.5%
ValueCountFrequency (%)
208.11
 
0.5%
202.62
1.0%
199.62
1.0%
199.21
 
0.5%
198.94
2.0%
1971
 
0.5%
193.81
 
0.5%
192.73
1.5%
191.71
 
0.5%
190.92
1.0%

width
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct44
Distinct (%)21.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean65.90780488
Minimum60.3
Maximum72.3
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum60.3
5-th percentile63.6
Q164.1
median65.5
Q366.9
95-th percentile70.46
Maximum72.3
Range12
Interquartile range (IQR)2.8

Descriptive statistics

Standard deviation2.145203853
Coefficient of variation (CV)0.03254855562
Kurtosis0.7027642441
Mean65.90780488
Median Absolute Deviation (MAD)1.4
Skewness0.9040034988
Sum13511.1
Variance4.60189957
MonotonicityNot monotonic
Histogram with fixed size bins (bins=44)
ValueCountFrequency (%)
63.824
 
11.7%
66.523
 
11.2%
65.415
 
7.3%
63.611
 
5.4%
64.410
 
4.9%
68.410
 
4.9%
649
 
4.4%
65.58
 
3.9%
65.27
 
3.4%
64.26
 
2.9%
Other values (34)82
40.0%
ValueCountFrequency (%)
60.31
 
0.5%
61.81
 
0.5%
62.51
 
0.5%
63.41
 
0.5%
63.611
5.4%
63.824
11.7%
63.93
 
1.5%
649
 
4.4%
64.12
 
1.0%
64.26
 
2.9%
ValueCountFrequency (%)
72.31
 
0.5%
721
 
0.5%
71.73
1.5%
71.43
1.5%
70.91
 
0.5%
70.61
 
0.5%
70.51
 
0.5%
70.33
1.5%
69.62
1.0%
68.94
2.0%

height
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct49
Distinct (%)23.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean53.72487805
Minimum47.8
Maximum59.8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum47.8
5-th percentile49.7
Q152
median54.1
Q355.5
95-th percentile57.5
Maximum59.8
Range12
Interquartile range (IQR)3.5

Descriptive statistics

Standard deviation2.44352197
Coefficient of variation (CV)0.04548213153
Kurtosis-0.4438123651
Mean53.72487805
Median Absolute Deviation (MAD)1.6
Skewness0.06312273247
Sum11013.6
Variance5.970799617
MonotonicityNot monotonic
Histogram with fixed size bins (bins=49)
ValueCountFrequency (%)
50.814
 
6.8%
5212
 
5.9%
55.712
 
5.9%
54.110
 
4.9%
54.510
 
4.9%
55.59
 
4.4%
56.78
 
3.9%
54.38
 
3.9%
52.67
 
3.4%
56.17
 
3.4%
Other values (39)108
52.7%
ValueCountFrequency (%)
47.81
 
0.5%
48.82
 
1.0%
49.42
 
1.0%
49.64
 
2.0%
49.73
 
1.5%
50.26
2.9%
50.52
 
1.0%
50.65
 
2.4%
50.814
6.8%
511
 
0.5%
ValueCountFrequency (%)
59.82
 
1.0%
59.13
 
1.5%
58.74
2.0%
58.31
 
0.5%
57.53
 
1.5%
56.78
3.9%
56.52
 
1.0%
56.32
 
1.0%
56.23
 
1.5%
56.17
3.4%

curb-weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct171
Distinct (%)83.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2555.565854
Minimum1488
Maximum4066
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum1488
5-th percentile1901
Q12145
median2414
Q32935
95-th percentile3503
Maximum4066
Range2578
Interquartile range (IQR)790

Descriptive statistics

Standard deviation520.6802035
Coefficient of variation (CV)0.2037436064
Kurtosis-0.0428537661
Mean2555.565854
Median Absolute Deviation (MAD)386
Skewness0.6813981891
Sum523891
Variance271107.8743
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
23854
 
2.0%
19183
 
1.5%
22753
 
1.5%
19893
 
1.5%
24102
 
1.0%
21912
 
1.0%
25352
 
1.0%
20242
 
1.0%
24142
 
1.0%
40662
 
1.0%
Other values (161)180
87.8%
ValueCountFrequency (%)
14881
0.5%
17131
0.5%
18191
0.5%
18371
0.5%
18742
1.0%
18762
1.0%
18891
0.5%
18901
0.5%
19001
0.5%
19051
0.5%
ValueCountFrequency (%)
40662
1.0%
39501
0.5%
39001
0.5%
37701
0.5%
37501
0.5%
37401
0.5%
37151
0.5%
36851
0.5%
35151
0.5%
35051
0.5%

engine-type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct7
Distinct (%)3.4%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
ohc
148 
ohcf
15 
ohcv
 
13
dohc
 
12
l
 
12
Other values (2)
 
5

Length

Max length5
Median length3
Mean length3.126829268
Min length1

Characters and Unicode

Total characters641
Distinct characters9
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)0.5%

Sample

1st rowdohc
2nd rowdohc
3rd rowohcv
4th rowohc
5th rowohc

Common Values

ValueCountFrequency (%)
ohc148
72.2%
ohcf15
 
7.3%
ohcv13
 
6.3%
dohc12
 
5.9%
l12
 
5.9%
rotor4
 
2.0%
dohcv1
 
0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
ohc148
72.2%
ohcf15
 
7.3%
ohcv13
 
6.3%
dohc12
 
5.9%
l12
 
5.9%
rotor4
 
2.0%
dohcv1
 
0.5%

Most occurring characters

ValueCountFrequency (%)
o197
30.7%
h189
29.5%
c189
29.5%
f15
 
2.3%
v14
 
2.2%
d13
 
2.0%
l12
 
1.9%
r8
 
1.2%
t4
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter641
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o197
30.7%
h189
29.5%
c189
29.5%
f15
 
2.3%
v14
 
2.2%
d13
 
2.0%
l12
 
1.9%
r8
 
1.2%
t4
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin641
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o197
30.7%
h189
29.5%
c189
29.5%
f15
 
2.3%
v14
 
2.2%
d13
 
2.0%
l12
 
1.9%
r8
 
1.2%
t4
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII641
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o197
30.7%
h189
29.5%
c189
29.5%
f15
 
2.3%
v14
 
2.2%
d13
 
2.0%
l12
 
1.9%
r8
 
1.2%
t4
 
0.6%

num-of-cylinders
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct7
Distinct (%)3.4%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
four
159 
six
24 
five
 
11
eight
 
5
two
 
4
Other values (2)
 
2

Length

Max length6
Median length4
Mean length3.902439024
Min length3

Characters and Unicode

Total characters800
Distinct characters14
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)1.0%

Sample

1st rowfour
2nd rowfour
3rd rowsix
4th rowfour
5th rowfive

Common Values

ValueCountFrequency (%)
four159
77.6%
six24
 
11.7%
five11
 
5.4%
eight5
 
2.4%
two4
 
2.0%
three1
 
0.5%
twelve1
 
0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
four159
77.6%
six24
 
11.7%
five11
 
5.4%
eight5
 
2.4%
two4
 
2.0%
three1
 
0.5%
twelve1
 
0.5%

Most occurring characters

ValueCountFrequency (%)
f170
21.2%
o163
20.4%
r160
20.0%
u159
19.9%
i40
 
5.0%
s24
 
3.0%
x24
 
3.0%
e20
 
2.5%
v12
 
1.5%
t11
 
1.4%
Other values (4)17
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter800
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f170
21.2%
o163
20.4%
r160
20.0%
u159
19.9%
i40
 
5.0%
s24
 
3.0%
x24
 
3.0%
e20
 
2.5%
v12
 
1.5%
t11
 
1.4%
Other values (4)17
 
2.1%

Most occurring scripts

ValueCountFrequency (%)
Latin800
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f170
21.2%
o163
20.4%
r160
20.0%
u159
19.9%
i40
 
5.0%
s24
 
3.0%
x24
 
3.0%
e20
 
2.5%
v12
 
1.5%
t11
 
1.4%
Other values (4)17
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f170
21.2%
o163
20.4%
r160
20.0%
u159
19.9%
i40
 
5.0%
s24
 
3.0%
x24
 
3.0%
e20
 
2.5%
v12
 
1.5%
t11
 
1.4%
Other values (4)17
 
2.1%

engine-size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct44
Distinct (%)21.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean126.9073171
Minimum61
Maximum326
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum61
5-th percentile90
Q197
median120
Q3141
95-th percentile201.2
Maximum326
Range265
Interquartile range (IQR)44

Descriptive statistics

Standard deviation41.64269344
Coefficient of variation (CV)0.3281346923
Kurtosis5.305682092
Mean126.9073171
Median Absolute Deviation (MAD)23
Skewness1.947655045
Sum26016
Variance1734.113917
MonotonicityNot monotonic
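
Several of the figures above are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum − minimum, IQR = Q3 − Q1). The sketch below recomputes them from the values reported for engine-size. It is only a consistency check on the printed numbers, not the report's own code, and `n = 205` is taken from the dataset overview.

```python
import math

# Values copied from the engine-size summary.
n = 205                      # number of observations (dataset overview)
total = 26016                # Sum
std = 41.64269344            # Standard deviation
minimum, maximum = 61, 326
q1, q3 = 97, 141

derived = {
    "mean": total / n,               # reported: 126.9073171
    "variance": std ** 2,            # reported: 1734.113917
    "cv": std / (total / n),         # reported: 0.3281346923
    "range": maximum - minimum,      # reported: 265
    "iqr": q3 - q1,                  # reported: 44
}

for name, value in derived.items():
    print(f"{name}: {value}")
```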
Histogram with fixed size bins (bins=44)
ValueCountFrequency (%)
12215
 
7.3%
9215
 
7.3%
9714
 
6.8%
9814
 
6.8%
10813
 
6.3%
9012
 
5.9%
11012
 
5.9%
1098
 
3.9%
1207
 
3.4%
1417
 
3.4%
Other values (34)88
42.9%
ValueCountFrequency (%)
611
 
0.5%
703
 
1.5%
791
 
0.5%
801
 
0.5%
9012
5.9%
915
 
2.4%
9215
7.3%
9714
6.8%
9814
6.8%
1031
 
0.5%
ValueCountFrequency (%)
3261
 
0.5%
3081
 
0.5%
3041
 
0.5%
2582
 
1.0%
2342
 
1.0%
2093
1.5%
2031
 
0.5%
1943
1.5%
1834
2.0%
1816
2.9%

fuel-system
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct8
Distinct (%)3.9%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
mpfi
94 
2bbl
66 
idi
20 
1bbl
11 
spdi
 
9
Other values (3)
 
5

Length

Max length4
Median length4
Mean length3.897560976
Min length3

Characters and Unicode

Total characters799
Distinct characters11
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)1.0%

Sample

1st rowmpfi
2nd rowmpfi
3rd rowmpfi
4th rowmpfi
5th rowmpfi

Common Values

ValueCountFrequency (%)
mpfi94
45.9%
2bbl66
32.2%
idi20
 
9.8%
1bbl11
 
5.4%
spdi9
 
4.4%
4bbl3
 
1.5%
mfi1
 
0.5%
spfi1
 
0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
mpfi94
45.9%
2bbl66
32.2%
idi20
 
9.8%
1bbl11
 
5.4%
spdi9
 
4.4%
4bbl3
 
1.5%
mfi1
 
0.5%
spfi1
 
0.5%

Most occurring characters

ValueCountFrequency (%)
b160
20.0%
i145
18.1%
p104
13.0%
f96
12.0%
m95
11.9%
l80
10.0%
266
8.3%
d29
 
3.6%
111
 
1.4%
s10
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter719
90.0%
Decimal Number80
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
b160
22.3%
i145
20.2%
p104
14.5%
f96
13.4%
m95
13.2%
l80
11.1%
d29
 
4.0%
s10
 
1.4%
Decimal Number
ValueCountFrequency (%)
266
82.5%
111
 
13.8%
43
 
3.8%

Most occurring scripts

ValueCountFrequency (%)
Latin719
90.0%
Common80
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
b160
22.3%
i145
20.2%
p104
14.5%
f96
13.4%
m95
13.2%
l80
11.1%
d29
 
4.0%
s10
 
1.4%
Common
ValueCountFrequency (%)
266
82.5%
111
 
13.8%
43
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII799
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b160
20.0%
i145
18.1%
p104
13.0%
f96
12.0%
m95
11.9%
l80
10.0%
266
8.3%
d29
 
3.6%
111
 
1.4%
s10
 
1.3%

bore
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct39
Distinct (%)19.0%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
3.62
23 
3.19
20 
3.15
15 
3.03
 
12
2.97
 
12
Other values (34)
123 

Length

Max length4
Median length4
Mean length3.941463415
Min length1

Characters and Unicode

Total characters808
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9 ?
Unique (%)4.4%

Sample

1st row3.47
2nd row3.47
3rd row2.68
4th row3.19
5th row3.19

Common Values

ValueCountFrequency (%)
3.6223
 
11.2%
3.1920
 
9.8%
3.1515
 
7.3%
3.0312
 
5.9%
2.9712
 
5.9%
3.469
 
4.4%
3.318
 
3.9%
3.788
 
3.9%
3.438
 
3.9%
3.277
 
3.4%
Other values (29)83
40.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3.6223
 
11.2%
3.1920
 
9.8%
3.1515
 
7.3%
3.0312
 
5.9%
2.9712
 
5.9%
3.469
 
4.4%
3.318
 
3.9%
3.788
 
3.9%
3.438
 
3.9%
2.917
 
3.4%
Other values (29)83
40.5%

Most occurring characters

ValueCountFrequency (%)
3225
27.8%
.201
24.9%
161
 
7.5%
256
 
6.9%
953
 
6.6%
543
 
5.3%
741
 
5.1%
638
 
4.7%
034
 
4.2%
434
 
4.2%
Other values (2)22
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number603
74.6%
Other Punctuation205
 
25.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3225
37.3%
161
 
10.1%
256
 
9.3%
953
 
8.8%
543
 
7.1%
741
 
6.8%
638
 
6.3%
034
 
5.6%
434
 
5.6%
818
 
3.0%
Other Punctuation
ValueCountFrequency (%)
.201
98.0%
?4
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Common808
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3225
27.8%
.201
24.9%
161
 
7.5%
256
 
6.9%
953
 
6.6%
543
 
5.3%
741
 
5.1%
638
 
4.7%
034
 
4.2%
434
 
4.2%
Other values (2)22
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII808
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3225
27.8%
.201
24.9%
161
 
7.5%
256
 
6.9%
953
 
6.6%
543
 
5.3%
741
 
5.1%
638
 
4.7%
034
 
4.2%
434
 
4.2%
Other values (2)22
 
2.7%

stroke
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct37
Distinct (%)18.0%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
3.40
20 
3.23
14 
3.15
14 
3.03
14 
3.39
 
13
Other values (32)
130 

Length

Max length4
Median length4
Mean length3.941463415
Min length1

Characters and Unicode

Total characters808
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)3.4%

Sample

1st row2.68
2nd row2.68
3rd row3.47
4th row3.40
5th row3.40

Common Values

ValueCountFrequency (%)
3.4020
 
9.8%
3.2314
 
6.8%
3.1514
 
6.8%
3.0314
 
6.8%
3.3913
 
6.3%
2.6411
 
5.4%
3.299
 
4.4%
3.359
 
4.4%
3.468
 
3.9%
3.116
 
2.9%
Other values (27)87
42.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3.4020
 
9.8%
3.1514
 
6.8%
3.0314
 
6.8%
3.2314
 
6.8%
3.3913
 
6.3%
2.6411
 
5.4%
3.299
 
4.4%
3.359
 
4.4%
3.468
 
3.9%
3.586
 
2.9%
Other values (27)87
42.4%

Most occurring characters

ValueCountFrequency (%)
3226
28.0%
.201
24.9%
460
 
7.4%
260
 
7.4%
059
 
7.3%
147
 
5.8%
544
 
5.4%
936
 
4.5%
633
 
4.1%
721
 
2.6%
Other values (2)21
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number603
74.6%
Other Punctuation205
 
25.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3226
37.5%
460
 
10.0%
260
 
10.0%
059
 
9.8%
147
 
7.8%
544
 
7.3%
936
 
6.0%
633
 
5.5%
721
 
3.5%
817
 
2.8%
Other Punctuation
ValueCountFrequency (%)
.201
98.0%
?4
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Common808
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3226
28.0%
.201
24.9%
460
 
7.4%
260
 
7.4%
059
 
7.3%
147
 
5.8%
544
 
5.4%
936
 
4.5%
633
 
4.1%
721
 
2.6%
Other values (2)21
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII808
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3226
28.0%
.201
24.9%
460
 
7.4%
260
 
7.4%
059
 
7.3%
147
 
5.8%
544
 
5.4%
936
 
4.5%
633
 
4.1%
721
 
2.6%
Other values (2)21
 
2.6%

compression-ratio
Real number (ℝ≥0)

HIGH CORRELATION

Distinct32
Distinct (%)15.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean10.14253659
Minimum7
Maximum23
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum7
5-th percentile7.5
Q18.6
median9
Q39.4
95-th percentile21.82
Maximum23
Range16
Interquartile range (IQR)0.8

Descriptive statistics

Standard deviation3.972040322
Coefficient of variation (CV)0.3916219861
Kurtosis5.233054348
Mean10.14253659
Median Absolute Deviation (MAD)0.4
Skewness2.610862458
Sum2079.22
Variance15.77710432
MonotonicityNot monotonic
Histogram with fixed size bins (bins=32)
ValueCountFrequency (%)
946
22.4%
9.426
12.7%
8.514
 
6.8%
9.513
 
6.3%
9.311
 
5.4%
8.79
 
4.4%
88
 
3.9%
9.28
 
3.9%
77
 
3.4%
8.65
 
2.4%
Other values (22)58
28.3%
ValueCountFrequency (%)
77
3.4%
7.55
 
2.4%
7.64
 
2.0%
7.72
 
1.0%
7.81
 
0.5%
88
3.9%
8.12
 
1.0%
8.33
 
1.5%
8.45
 
2.4%
8.514
6.8%
ValueCountFrequency (%)
235
2.4%
22.71
 
0.5%
22.53
1.5%
221
 
0.5%
21.91
 
0.5%
21.54
2.0%
215
2.4%
11.51
 
0.5%
10.11
 
0.5%
103
1.5%

horsepower
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct60
Distinct (%)29.3%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
68
19 
70
 
11
69
 
10
116
 
9
110
 
8
Other values (55)
148 

Length

Max length3
Median length2
Mean length2.448780488
Min length1

Characters and Unicode

Total characters502
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique20 ?
Unique (%)9.8%

Sample

1st row111
2nd row111
3rd row154
4th row102
5th row115

Common Values

ValueCountFrequency (%)
6819
 
9.3%
7011
 
5.4%
6910
 
4.9%
1169
 
4.4%
1108
 
3.9%
957
 
3.4%
886
 
2.9%
626
 
2.9%
1016
 
2.9%
1606
 
2.9%
Other values (50)117
57.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
6819
 
9.3%
7011
 
5.4%
6910
 
4.9%
1169
 
4.4%
1108
 
3.9%
957
 
3.4%
1016
 
2.9%
1146
 
2.9%
1606
 
2.9%
626
 
2.9%
Other values (50)117
57.1%

Most occurring characters

ValueCountFrequency (%)
1133
26.5%
673
14.5%
858
11.6%
052
 
10.4%
249
 
9.8%
535
 
7.0%
732
 
6.4%
931
 
6.2%
427
 
5.4%
310
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number500
99.6%
Other Punctuation2
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1133
26.6%
673
14.6%
858
11.6%
052
 
10.4%
249
 
9.8%
535
 
7.0%
732
 
6.4%
931
 
6.2%
427
 
5.4%
310
 
2.0%
Other Punctuation
ValueCountFrequency (%)
?2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common502
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1133
26.5%
673
14.5%
858
11.6%
052
 
10.4%
249
 
9.8%
535
 
7.0%
732
 
6.4%
931
 
6.2%
427
 
5.4%
310
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII502
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1133
26.5%
673
14.5%
858
11.6%
052
 
10.4%
249
 
9.8%
535
 
7.0%
732
 
6.4%
931
 
6.2%
427
 
5.4%
310
 
2.0%

peak-rpm
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct24
Distinct (%)11.7%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
5500
37 
4800
36 
5000
27 
5200
23 
5400
13 
Other values (19)
69 

Length

Max length4
Median length4
Mean length3.970731707
Min length1

Characters and Unicode

Total characters814
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)2.4%

Sample

1st row5000
2nd row5000
3rd row5000
4th row5500
5th row5500

Common Values

ValueCountFrequency (%)
550037
18.0%
480036
17.6%
500027
13.2%
520023
11.2%
540013
 
6.3%
60009
 
4.4%
52507
 
3.4%
45007
 
3.4%
58007
 
3.4%
42005
 
2.4%
Other values (14)34
16.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
550037
18.0%
480036
17.6%
500027
13.2%
520023
11.2%
540013
 
6.3%
60009
 
4.4%
52507
 
3.4%
45007
 
3.4%
58007
 
3.4%
42005
 
2.4%
Other values (14)34
16.6%

Most occurring characters

ValueCountFrequency (%)
0417
51.2%
5192
23.6%
485
 
10.4%
843
 
5.3%
238
 
4.7%
615
 
1.8%
18
 
1.0%
75
 
0.6%
35
 
0.6%
94
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number812
99.8%
Other Punctuation2
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0417
51.4%
5192
23.6%
485
 
10.5%
843
 
5.3%
238
 
4.7%
615
 
1.8%
18
 
1.0%
75
 
0.6%
35
 
0.6%
94
 
0.5%
Other Punctuation
ValueCountFrequency (%)
?2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common814
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0417
51.2%
5192
23.6%
485
 
10.4%
843
 
5.3%
238
 
4.7%
615
 
1.8%
18
 
1.0%
75
 
0.6%
35
 
0.6%
94
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII814
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0417
51.2%
5192
23.6%
485
 
10.4%
843
 
5.3%
238
 
4.7%
615
 
1.8%
18
 
1.0%
75
 
0.6%
35
 
0.6%
94
 
0.5%

city-mpg
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct29
Distinct (%)14.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25.2195122
Minimum13
Maximum49
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum13
5-th percentile16
Q119
median24
Q330
95-th percentile37
Maximum49
Range36
Interquartile range (IQR)11

Descriptive statistics

Standard deviation6.542141653
Coefficient of variation (CV)0.2594079379
Kurtosis0.5786483405
Mean25.2195122
Median Absolute Deviation (MAD)5
Skewness0.6637040288
Sum5170
Variance42.79961741
MonotonicityNot monotonic
Histogram with fixed size bins (bins=29)
ValueCountFrequency (%)
3128
13.7%
1927
13.2%
2422
10.7%
2714
 
6.8%
1713
 
6.3%
2612
 
5.9%
2312
 
5.9%
218
 
3.9%
258
 
3.9%
308
 
3.9%
Other values (19)53
25.9%
ValueCountFrequency (%)
131
 
0.5%
142
 
1.0%
153
 
1.5%
166
 
2.9%
1713
6.3%
183
 
1.5%
1927
13.2%
203
 
1.5%
218
 
3.9%
224
 
2.0%
ValueCountFrequency (%)
491
 
0.5%
471
 
0.5%
451
 
0.5%
387
3.4%
376
2.9%
361
 
0.5%
351
 
0.5%
341
 
0.5%
331
 
0.5%
321
 
0.5%

highway-mpg
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct30
Distinct (%)14.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean30.75121951
Minimum16
Maximum54
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.7 KiB

Quantile statistics

Minimum16
5-th percentile22
Q125
median30
Q334
95-th percentile42.8
Maximum54
Range38
Interquartile range (IQR)9

Descriptive statistics

Standard deviation6.886443131
Coefficient of variation (CV)0.2239404889
Kurtosis0.4400703815
Mean30.75121951
Median Absolute Deviation (MAD)5
Skewness0.5399971879
Sum6304
Variance47.423099
MonotonicityNot monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
2519
 
9.3%
3817
 
8.3%
2417
 
8.3%
3016
 
7.8%
3216
 
7.8%
3414
 
6.8%
3713
 
6.3%
2813
 
6.3%
2910
 
4.9%
339
 
4.4%
Other values (20)61
29.8%
ValueCountFrequency (%)
162
 
1.0%
171
 
0.5%
182
 
1.0%
192
 
1.0%
202
 
1.0%
228
3.9%
237
 
3.4%
2417
8.3%
2519
9.3%
263
 
1.5%
ValueCountFrequency (%)
541
 
0.5%
531
 
0.5%
501
 
0.5%
472
 
1.0%
462
 
1.0%
434
 
2.0%
423
 
1.5%
413
 
1.5%
392
 
1.0%
3817
8.3%

price
Categorical

HIGH CARDINALITY
UNIFORM

Distinct187
Distinct (%)91.2%
Missing0
Missing (%)0.0%
Memory size1.7 KiB
?
 
4
8921
 
2
18150
 
2
8845
 
2
8495
 
2
Other values (182)
193 

Length

Max length5
Median length5
Mean length4.443902439
Min length1

Characters and Unicode

Total characters911
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique171 ?
Unique (%)83.4%

Sample

1st row13495
2nd row16500
3rd row16500
4th row13950
5th row17450

Common Values

ValueCountFrequency (%)
?4
 
2.0%
89212
 
1.0%
181502
 
1.0%
88452
 
1.0%
84952
 
1.0%
76092
 
1.0%
66922
 
1.0%
62292
 
1.0%
79572
 
1.0%
77752
 
1.0%
Other values (177)183
89.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
?4
 
2.0%
77752
 
1.0%
89212
 
1.0%
134992
 
1.0%
78982
 
1.0%
165002
 
1.0%
92792
 
1.0%
55722
 
1.0%
72952
 
1.0%
79572
 
1.0%
Other values (177)183
89.3%

Most occurring characters

ValueCountFrequency (%)
9152
16.7%
1123
13.5%
5123
13.5%
898
10.8%
082
9.0%
673
8.0%
271
7.8%
769
7.6%
466
7.2%
350
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number907
99.6%
Other Punctuation4
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9152
16.8%
1123
13.6%
5123
13.6%
898
10.8%
082
9.0%
673
8.0%
271
7.8%
769
7.6%
466
7.3%
350
 
5.5%
Other Punctuation
ValueCountFrequency (%)
?4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common911
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9152
16.7%
1123
13.5%
5123
13.5%
898
10.8%
082
9.0%
673
8.0%
271
7.8%
769
7.6%
466
7.2%
350
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII911
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9152
16.7%
1123
13.5%
5123
13.5%
898
10.8%
082
9.0%
673
8.0%
271
7.8%
769
7.6%
466
7.2%
350
 
5.5%

Interactions

[Figure: 100 interaction plots (Matplotlib v3.5.1)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
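
A minimal sketch of that calculation in plain Python follows, assuming ties receive the average of the ranks they span (a common convention, though not confirmed as the one this report uses). The function names are illustrative. The sample data are the engine-size and city-mpg values from the first ten rows of the dataset.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


engine_size = [130, 130, 152, 109, 136, 136, 136, 136, 131, 131]
city_mpg = [21, 21, 19, 24, 18, 19, 19, 19, 17, 16]
print(spearman_rho(engine_size, city_mpg))
```

Because only ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.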

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
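
The calculation can be sketched in a few lines of plain Python. This is an illustrative implementation rather than the report's own code; the sample data are the city-mpg and highway-mpg values from the first ten rows of the dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


city_mpg = [21, 21, 19, 24, 18, 19, 19, 19, 17, 16]
highway_mpg = [27, 27, 26, 30, 22, 25, 25, 25, 20, 22]

r = pearson_r(city_mpg, highway_mpg)
# Shifting and (positively) rescaling one variable should leave r unchanged.
r_shifted = pearson_r([2.5 * v + 10 for v in city_mpg], highway_mpg)
print(r, r_shifted)
```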

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
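
The pair-counting definition can be written directly as a short sketch. This is the plain form described here, often called τ-a; tie-adjusted variants such as τ-b use a different denominator, so a library's value may differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions; ties count as neither.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


city_mpg = [21, 21, 19, 24, 18, 19, 19, 19, 17, 16]
highway_mpg = [27, 27, 26, 30, 22, 25, 25, 25, 20, 22]
print(kendall_tau_a(city_mpg, highway_mpg))
```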

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
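
The sketch below shows one way to compute a bias-corrected Cramér's V in plain Python. It uses the correction as it is commonly stated (shrinking φ² and the table dimensions by terms in 1/(n − 1)). Treat this as an assumption about the formula rather than a copy of the report's implementation; the function name is illustrative. The sample data are the fuel-type and aspiration values from the last ten rows of the dataset.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r_label, r_count in row_totals.items():
        for c_label, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r_label, c_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Assumed bias correction: shrink phi^2 and the effective table size.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


fuel_type = ["gas"] * 8 + ["diesel", "gas"]
aspiration = ["std", "std", "std", "turbo", "turbo",
              "std", "turbo", "std", "turbo", "turbo"]
print(cramers_v_corrected(fuel_type, aspiration))
```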

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
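
The per-column nullity count behind such a display can be sketched as follows. Whether this dataset uses "?" as a missing-value placeholder is an assumption here: the report counts those cells as ordinary strings, which is consistent with it reporting zero missing cells even though "?" appears in columns such as price. The constant `MISSING_TOKEN` and the function name are hypothetical; the three records are abbreviated from rows 0, 3 and 9 of the sample.

```python
MISSING_TOKEN = "?"  # assumption: placeholder used for missing values

rows = [
    {"normalized-losses": "?", "horsepower": "111", "price": "13495"},
    {"normalized-losses": "164", "horsepower": "102", "price": "13950"},
    {"normalized-losses": "?", "horsepower": "160", "price": "?"},
]


def nullity_by_column(records, missing_token=MISSING_TOKEN):
    """Count, per column, the cells that are None or equal to the placeholder."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts.setdefault(column, 0)
            if value is None or value == missing_token:
                counts[column] += 1
    return counts


print(nullity_by_column(rows))
```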

Sample

First rows

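index | symboling | normalized-losses | make | fuel-type | aspiration | num-of-doors | body-style | drive-wheels | engine-location | wheel-base | length | width | height | curb-weight | engine-type | num-of-cylinders | engine-size | fuel-system | bore | stroke | compression-ratio | horsepower | peak-rpm | city-mpg | highway-mpg | price
0 | 3 | ? | alfa-romero | gas | std | two | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | dohc | four | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495
1 | 3 | ? | alfa-romero | gas | std | two | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | dohc | four | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500
2 | 1 | ? | alfa-romero | gas | std | two | hatchback | rwd | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | ohcv | six | 152 | mpfi | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500
3 | 2 | 164 | audi | gas | std | four | sedan | fwd | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | ohc | four | 109 | mpfi | 3.19 | 3.40 | 10.0 | 102 | 5500 | 24 | 30 | 13950
4 | 2 | 164 | audi | gas | std | four | sedan | 4wd | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | ohc | five | 136 | mpfi | 3.19 | 3.40 | 8.0 | 115 | 5500 | 18 | 22 | 17450
5 | 2 | ? | audi | gas | std | two | sedan | fwd | front | 99.8 | 177.3 | 66.3 | 53.1 | 2507 | ohc | five | 136 | mpfi | 3.19 | 3.40 | 8.5 | 110 | 5500 | 19 | 25 | 15250
6 | 1 | 158 | audi | gas | std | four | sedan | fwd | front | 105.8 | 192.7 | 71.4 | 55.7 | 2844 | ohc | five | 136 | mpfi | 3.19 | 3.40 | 8.5 | 110 | 5500 | 19 | 25 | 17710
7 | 1 | ? | audi | gas | std | four | wagon | fwd | front | 105.8 | 192.7 | 71.4 | 55.7 | 2954 | ohc | five | 136 | mpfi | 3.19 | 3.40 | 8.5 | 110 | 5500 | 19 | 25 | 18920
8 | 1 | 158 | audi | gas | turbo | four | sedan | fwd | front | 105.8 | 192.7 | 71.4 | 55.9 | 3086 | ohc | five | 131 | mpfi | 3.13 | 3.40 | 8.3 | 140 | 5500 | 17 | 20 | 23875
9 | 0 | ? | audi | gas | turbo | two | hatchback | 4wd | front | 99.5 | 178.2 | 67.9 | 52.0 | 3053 | ohc | five | 131 | mpfi | 3.13 | 3.40 | 7.0 | 160 | 5500 | 16 | 22 | ?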

Last rows

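index | symboling | normalized-losses | make | fuel-type | aspiration | num-of-doors | body-style | drive-wheels | engine-location | wheel-base | length | width | height | curb-weight | engine-type | num-of-cylinders | engine-size | fuel-system | bore | stroke | compression-ratio | horsepower | peak-rpm | city-mpg | highway-mpg | price
195 | -1 | 74 | volvo | gas | std | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3034 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 9.5 | 114 | 5400 | 23 | 28 | 13415
196 | -2 | 103 | volvo | gas | std | four | sedan | rwd | front | 104.3 | 188.8 | 67.2 | 56.2 | 2935 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 15985
197 | -1 | 74 | volvo | gas | std | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3042 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 9.5 | 114 | 5400 | 24 | 28 | 16515
198 | -2 | 103 | volvo | gas | turbo | four | sedan | rwd | front | 104.3 | 188.8 | 67.2 | 56.2 | 3045 | ohc | four | 130 | mpfi | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18420
199 | -1 | 74 | volvo | gas | turbo | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | ohc | four | 130 | mpfi | 3.62 | 3.15 | 7.5 | 162 | 5100 | 17 | 22 | 18950
200 | -1 | 95 | volvo | gas | std | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 2952 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 9.5 | 114 | 5400 | 23 | 28 | 16845
201 | -1 | 95 | volvo | gas | turbo | four | sedan | rwd | front | 109.1 | 188.8 | 68.8 | 55.5 | 3049 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 8.7 | 160 | 5300 | 19 | 25 | 19045
202 | -1 | 95 | volvo | gas | std | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3012 | ohcv | six | 173 | mpfi | 3.58 | 2.87 | 8.8 | 134 | 5500 | 18 | 23 | 21485
203 | -1 | 95 | volvo | diesel | turbo | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3217 | ohc | six | 145 | idi | 3.01 | 3.40 | 23.0 | 106 | 4800 | 26 | 27 | 22470
204 | -1 | 95 | volvo | gas | turbo | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3062 | ohc | four | 141 | mpfi | 3.78 | 3.15 | 9.5 | 114 | 5400 | 19 | 25 | 22625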